A Fast Algorithm Combining FP-Tree and TID-List for Frequent Pattern Mining

Authors

  • Lan Vu
  • Gita Alaghband
Abstract

Finding frequent patterns plays an essential role in mining associations, correlations, and many other interesting relationships among variables in transactional databases. The performance of a frequent pattern mining algorithm depends on many factors. One important factor is the characteristics of databases being analyzed. In this paper we propose FEM (FP-growth & Eclat Mining), a new algorithm that utilizes both FP-tree (frequent-pattern tree) and TID-list (transaction ID list) data structures to discover frequent patterns. FEM can adapt its behavior to the dataset properties to efficiently mine short and long patterns from both sparse and dense datasets. We also suggest a combination of several optimization techniques for effectively implementing FEM to speed up the mining process. The experimental results show that a significant improvement in performance is achieved.
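The two data structures named in the abstract can be illustrated with a minimal Python sketch. This is not the FEM algorithm itself: the abstract does not describe how FEM switches between the structures, so the code below only shows, under stated assumptions, what a TID-list and an FP-tree hold for a small toy database. All names (`build_tid_lists`, `support`, `FPNode`, `build_fp_tree`) and the sample transactions are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def build_tid_lists(transactions):
    """Vertical layout: map each item to the set of transaction IDs containing it."""
    tid_lists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tid_lists[item].add(tid)
    return dict(tid_lists)


def support(itemset, tid_lists):
    """Eclat-style support count: intersect the TID-lists of all items."""
    tids = set.intersection(*(tid_lists[item] for item in itemset))
    return len(tids)


class FPNode:
    """One node of a prefix tree; count = number of transactions sharing this prefix."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Horizontal layout: insert each transaction's frequent items into a prefix tree.

    Items are ordered by descending frequency (ties broken by item name, an
    assumption made here for determinism) so common prefixes are shared.
    """
    counts = defaultdict(int)
    for items in transactions:
        for item in set(items):
            counts[item] += 1
    frequent = {item for item, c in counts.items() if c >= min_support}

    root = FPNode(None)
    for items in transactions:
        ordered = sorted(
            (item for item in set(items) if item in frequent),
            key=lambda item: (-counts[item], item),
        )
        node = root
        for item in ordered:
            node = node.children.setdefault(item, FPNode(item, node))
            node.count += 1
    return root


# Hypothetical toy database of five transactions.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
tid_lists = build_tid_lists(transactions)
tree = build_fp_tree(transactions, min_support=2)
print(support({"a", "b"}, tid_lists))
print(tree.children["a"].children["b"].count)
```

In this sketch, support counting on TID-lists is a set intersection, while the FP-tree compresses transactions that share a prefix into a single path with counts. Which structure is cheaper depends on the data, which is the trade-off the abstract says FEM adapts to.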

Similar resources

A New Fast Vertical Method for Mining Frequent Patterns

Vertical mining methods are very effective for mining frequent patterns and usually outperform horizontal mining methods. However, the vertical methods become ineffective since the intersection time starts to be costly when the cardinality of tidset (tid-list or diffset) is very large or there are a very large number of transactions. In this paper, we propose a novel vertical algorithm called P...

A Compact FP-Tree for Fast Frequent Pattern Retrieval

Frequent patterns are useful in many data mining problems, including query suggestion. Frequent patterns can be mined through the frequent pattern tree (FP-tree) data structure, which stores a compact (or compressed) representation of a transaction database (Han et al., 2000). In this paper, we propose an algorithm to compress frequent pattern set into a smaller one, and store the set in a...

Improved algorithm for mining maximum frequent patterns based on FP-Tree

Mining association rules is an important task in data mining, in which mining maximum frequent patterns is a key problem. Many previous algorithms mine maximum frequent patterns by first producing candidate patterns and then pruning them. But the cost of producing candidate patterns is very high, especially when long patterns exist. In this paper, the structure of an FP-tree is improved...

The Frequent Pattern List: Another Framework for Mining Frequent Patterns

The mining of frequent patterns (or frequent itemsets) plays an essential role in many data mining tasks. One major methodology for mining frequent patterns is the Apriori-based approach, which is computationally costly because many candidate itemsets have to be generated and verified. More recently, another approach using the Frequent-Pattern Tree (FP-tree) has been suggested to avoid the ...

Novel Techniques to Reduce Search Space in Periodic-Frequent Pattern Mining

Periodic-frequent patterns are an important class of regularities that exist in a transactional database. Informally, a frequent pattern is said to be periodic-frequent if it appears at a regular interval specified by the user (i.e., periodically) in a database. A pattern-growth algorithm, called PFP-growth, has been proposed in the literature to discover the patterns. This algorithm constructs...

Publication date: 2011